Grenada
Quality-Aware Translation Tagging in Multilingual RAG system
Moon, Hoyeon, Kim, Byeolhee, Verma, Nikhil
Multilingual Retrieval-Augmented Generation (mRAG) often retrieves English documents and translates them into the query language for low-resource settings. However, poor translation quality degrades response generation performance. Existing approaches either assume sufficient translation quality or utilize the rewriting method, which introduces factual distortion and hallucinations. To mitigate these problems, we propose Quality-Aware Translation Tagging in mRAG (QTT-RAG), which explicitly evaluates translation quality along three dimensions-semantic equivalence, grammatical accuracy, and naturalness&fluency-and attach these scores as metadata without altering the original content. We evaluate QTT-RAG against CrossRAG and DKM-RAG as baselines in two open-domain QA benchmarks (XORQA, MKQA) using six instruction-tuned LLMs ranging from 2.4B to 14B parameters, covering two low-resource languages (Korean and Finnish) and one high-resource language (Chinese). QTT-RAG outperforms the baselines by preserving factual integrity while enabling generator models to make informed decisions based on translation reliability. This approach allows for effective usage of cross-lingual documents in low-resource settings with limited native language documents, offering a practical and robust solution across multilingual domains.
- Europe > Norway (0.14)
- North America > Canada > Ontario > Toronto (0.04)
- Europe > Sweden (0.04)
- (12 more...)
- Leisure & Entertainment (1.00)
- Government (0.93)
- Media > Film (0.68)
Identifying Emerging Concepts in Large Corpora
We introduce a new method to identify emerging concepts in large text corpora. By analyzing changes in the heatmaps of the underlying embedding space, we are able to detect these concepts with high accuracy shortly after they originate, in turn outperforming common alternatives. We further demonstrate the utility of our approach by analyzing speeches in the U.S. Senate from 1941 to 2015. Our results suggest that the minority party is more active in introducing new concepts into the Senate discourse. We also identify specific concepts that closely correlate with the Senators' racial, ethnic, and gender identities. An implementation of our method is publicly available.
- Asia > Russia (0.28)
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Europe > Spain (0.04)
- (20 more...)
- Law > Statutes (1.00)
- Law > Environmental Law (1.00)
- Law > Civil Rights & Constitutional Law (1.00)
- (5 more...)
Chain of Code: Reasoning with a Language Model-Augmented Code Emulator
Li, Chengshu, Liang, Jacky, Zeng, Andy, Chen, Xinyun, Hausman, Karol, Sadigh, Dorsa, Levine, Sergey, Fei-Fei, Li, Xia, Fei, Ichter, Brian
Code provides a general syntactic structure to build complex programs and perform precise computations when paired with a code interpreter - we hypothesize that language models (LMs) can leverage code-writing to improve Chain of Thought reasoning not only for logic and arithmetic tasks, but also for semantic ones (and in particular, those that are a mix of both). For example, consider prompting an LM to write code that counts the number of times it detects sarcasm in an essay: the LM may struggle to write an implementation for "detect_sarcasm(string)" that can be executed by the interpreter (handling the edge cases would be insurmountable). However, LMs may still produce a valid solution if they not only write code, but also selectively "emulate" the interpreter by generating the expected output of "detect_sarcasm(string)" and other lines of code that cannot be executed. In this work, we propose Chain of Code (CoC), a simple yet surprisingly effective extension that improves LM code-driven reasoning. The key idea is to encourage LMs to format semantic sub-tasks in a program as flexible pseudocode that the interpreter can explicitly catch undefined behaviors and hand off to simulate with an LM (as an "LMulator"). Experiments demonstrate that Chain of Code outperforms Chain of Thought and other baselines across a variety of benchmarks; on BIG-Bench Hard, Chain of Code achieves 84%, a gain of 12% over Chain of Thought. CoC scales well with large and small models alike, and broadens the scope of reasoning questions that LMs can correctly answer by "thinking in code". Project webpage: https://chain-of-code.github.io.
- North America > The Bahamas (0.14)
- North America > Barbados (0.14)
- Asia > Middle East > Republic of Türkiye > Batman Province > Batman (0.04)
- (65 more...)
- Oceania > Australia (0.06)
- South America > Venezuela (0.05)
- North America > United States > District of Columbia > Washington (0.05)
- (8 more...)